Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress tracebacks when SIGTERM signal processing #5939

Closed

Conversation

jkonecny12
Copy link
Member

This is a workaround for
rhinstaller/kickstart-tests#1296

where we are most likely hitting issue that logging is not safe to use in the signal handlers because logging is not reentrant safe. To avoid failures in KS tests and also test if this is the correct issue, let's use this workaround.

The ideal fix should be to not use logging in atexit call which might be tricky to resolve and also we will use a lot of log outputs.

@jkonecny12 jkonecny12 added the f42 Fedora 42 label Oct 15, 2024
@jkonecny12
Copy link
Member Author

/kickstart-test --skip-testtypes whatever

Copy link
Contributor

@M4rtinK M4rtinK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as a hotfix. :)

@rvykydal
Copy link
Contributor

/kickstart-test --skip-testtypes whatever

Seems we are still seeing the same issue (in 2 tests from the run)?

[rvykydal@prnot4 ~/work/kstests/exit_handler_jirka/unpacked]$ tail kstest-part-luks-2.2024_10_15-16_25_22.r74kh4i0/virt-install.log 
16:36:27,564 ERR anaconda:  File "/usr/lib64/python3.13/logging/__init__.py", line 2231, in shutdown
16:36:28,513 ERR anaconda:    for wr in reversed(handlerList[:]):
16:36:28,513 ERR anaconda:  File "/sbin/anaconda", line 297, in terminate
16:36:28,713 WARNING brltty:unsupported generic resource identifier: bluetooth:
16:36:28,786 ERR anaconda:    sys.exit(1)
16:36:28,788 ERR anaconda:SystemExit: 1
16:36:30,340 NOTICE kernel:audit: type=1107 audit(1729010190.337:1126): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='avc:  denied  { stop } for auid=n/a uid=0 gid=0 path="/usr/lib/systemd/system/[email protected]" cmdline="/usr/lib/systemd/systemd-logind" function="bus_unit_method_start_generic" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=service permissive=1 exe="/usr/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
16:36:30,349 NOTICE audit:USER_AVC pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='avc:  denied  { stop } for auid=n/a uid=0 gid=0 path="/usr/lib/systemd/system/[email protected]" cmdline="/usr/lib/systemd/systemd-logind" function="bus_unit_method_start_generic" scontext=system_u:system_r:kernel_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=service permissive=1 exe="/usr/lib/systemd/systemd" sauid=0 hostname=? addr=? terminal=?'
16:36:30,428 INFO systemd-logind:Removed session 2.
16:36:33,634 WARNING brltty:unsupported generic resource identifier: bluetooth:
[rvykydal@prnot4 ~/work/kstests/exit_handler_jirka/unpacked]$ tail kstest-network-static-to-dhcp-pre.2024_10_15-16_15_14.r21blxxu/virt-install.log 
16:26:05,617 INFO systemd:Stopped lvm2-lvmdbusd.service - LVM2 D-Bus service.
16:26:05,617 INFO systemd:lvm2-lvmdbusd.service: Consumed 2.078s CPU time, 25M memory peak, 16.1M memory swap peak.
16:26:05,973 ERR anaconda:    h.flush()
16:26:05,973 ERR anaconda:  File "/usr/lib64/python3.13/logging/__init__.py", line 1135, in flush
16:26:05,974 ERR anaconda:    if self.stream and hasattr(self.stream, "flush"):
16:26:05,991 ERR anaconda:  File "/usr/lib64/python3.13/logging/__init__.py", line 1283, in stream
16:26:05,992 ERR anaconda:    @property
16:26:05,992 ERR anaconda:  File "/sbin/anaconda", line 297, in terminate
16:26:06,153 ERR anaconda:    sys.exit(1)
16:26:06,172 ERR anaconda:SystemExit: 1

@jkonecny12
Copy link
Member Author

Seems we are still seeing the same issue (in 2 tests from the run)?

OK, at least we know this is not the solution.

@jkonecny12 jkonecny12 force-pushed the master-logging-error-fixup branch from 60578f9 to 9c6c635 Compare October 16, 2024 08:32
@jkonecny12 jkonecny12 marked this pull request as draft October 16, 2024 08:32
@jkonecny12
Copy link
Member Author

Let's use this PR for testing purpose.

@jkonecny12
Copy link
Member Author

/kickstart-test --skip-testtypes whatever

@jkonecny12
Copy link
Member Author

No fails with the logging traceback here. Running the batch again.

@jkonecny12
Copy link
Member Author

/kickstart-test --skip-testtypes whatever

@jkonecny12
Copy link
Member Author

So we have a confirmation that the issue is not running threads in the background. In that case let's try something different.

This is a workaround for
rhinstaller/kickstart-tests#1296

where we are most likely hitting issue that logging is not safe to use
in the signal handlers because logging is not reentrant safe. To avoid
failures in KS tests and also test if this is the correct issue, let's
use this workaround.

The ideal fix should be to not use logging in atexit call which might be
tricky to resolve and also we will use a lot of log outputs.
@jkonecny12 jkonecny12 force-pushed the master-logging-error-fixup branch from 9c6c635 to 7d6c119 Compare October 17, 2024 16:02
@jkonecny12
Copy link
Member Author

/kickstart-test --skip-testtypes whatever

@jkonecny12
Copy link
Member Author

Let's try to prevent logging to raise exceptions.

@jkonecny12
Copy link
Member Author

We have a full exception first time:

17:49:51,909 ERR anaconda:Exception ignored in atexit callback <function exitHandler at 0x7f72ea8e94e0>:
17:49:51,910 ERR anaconda:Traceback (most recent call last):
17:49:51,910 ERR anaconda:  File "/sbin/anaconda", line 105, in exitHandler
17:49:52,087 ERR anaconda:    util.execWithRedirect("systemctl", ["--no-wall", "poweroff"], do_preexec=False)
17:49:52,087 ERR anaconda:  File "/usr/lib64/python3.13/site-packages/pyanaconda/core/util.py", line 259, in execWithRedirect
17:49:52,202 ERR anaconda:    return _run_program(argv, stdin=stdin, stdout=stdout, root=root, env_prune=env_prune,
17:49:52,202 ERR anaconda:  File "/usr/lib64/python3.13/site-packages/pyanaconda/core/util.py", line 236, in _run_program
17:49:52,202 ERR anaconda:    program_log.debug("Return code of %s: %d", argv[0], proc.returncode)
17:49:52,218 ERR anaconda:  File "/usr/lib64/python3.13/logging/__init__.py", line 1507, in debug
17:49:52,298 ERR anaconda:    self._log(DEBUG, msg, args, **kwargs)
17:49:52,298 ERR anaconda:  File "/usr/lib64/python3.13/logging/__init__.py", line 1664, in _log
17:49:52,304 ERR anaconda:    self.handle(record)
17:49:52,304 ERR anaconda:  File "/usr/lib64/python3.13/logging/__init__.py", line 1680, in handle
17:49:52,304 ERR anaconda:    self.callHandlers(record)
17:49:52,304 ERR anaconda:  File "/usr/lib64/python3.13/logging/__init__.py", line 1736, in callHandlers
17:49:52,357 ERR anaconda:    hdlr.handle(record)
17:49:52,357 ERR anaconda:  File "/usr/lib64/python3.13/site-packages/pyanaconda/anaconda_logging.py", line 61, in handle
17:49:52,457 ERR anaconda:    self.emit(record)       # pylint: disable=no-member
17:49:52,457 ERR anaconda:  File "/usr/lib64/python3.13/site-packages/pyanaconda/anaconda_logging.py", line 109, in emit
17:49:52,457 ERR anaconda:    journal.JournalHandler.emit(self, record)
17:49:52,457 ERR anaconda:  File "/usr/lib64/python3.13/site-packages/systemd/journal.py", line 617, in emit
17:49:52,574 ERR anaconda:    self.send(msg,
17:49:52,574 ERR anaconda:  File "/usr/lib64/python3.13/site-packages/systemd/journal.py", line 466, in send
17:49:52,574 ERR anaconda:    return sendv(*args)
17:49:52,574 ERR anaconda:  File "/sbin/anaconda", line 299, in terminate
17:49:52,574 ERR anaconda:    sys.exit(1)
17:49:52,574 ERR anaconda:SystemExit: 1

In any case it looks that the exception is not that helpful because it's just interrupted stack. The main issue is that SystemExit error should be silently passed in case it is not manually processed for some reason.
https://docs.python.org/3/library/exceptions.html#SystemExit

even exepthook should ignore this exception. The question is what is causing that this exception is printed to the journal. Most probably it is caused by

        # create an appropriately named Journal writing stream
        anaconda_stderr_stream = journal.stream("anaconda", priority=journal.LOG_ERR)
        # redirect stderr of this process to the stream
        os.dup2(anaconda_stderr_stream.fileno(), sys.stderr.fileno())

which is the new code in the Wayland solution which is causing all stderr prints to go to journal. However, that doesn't solve mystery of why it is even printed and what is causing that...

@jkonecny12 jkonecny12 force-pushed the master-logging-error-fixup branch from 7d6c119 to 2ca7f3c Compare October 18, 2024 15:01
@jkonecny12
Copy link
Member Author

/kickstart-test --skip-testtypes whatever

@jkonecny12
Copy link
Member Author

One of the possible solutions could be to remove dup2 call to print stderr to journal to restore previous state. Example of such solution:
https://src.fedoraproject.org/rpms/pyproject-rpm-macros/pull-request/353#_2__41

@jkonecny12
Copy link
Member Author

/kickstart-test --skip-testtypes whatever

1 similar comment
@jkonecny12
Copy link
Member Author

/kickstart-test --skip-testtypes whatever

@jkonecny12
Copy link
Member Author

Should be properly resolved by #5962

@jkonecny12 jkonecny12 closed this Oct 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
f42 Fedora 42
Development

Successfully merging this pull request may close these issues.

3 participants